
Wangyou Zhang

Representation-Regularized Convolutional Audio Transformer for Audio Understanding

Jan 29, 2026
Via arXiv

ICASSP 2026 URGENT Speech Enhancement Challenge

Jan 20, 2026
Via arXiv

Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment

Jun 13, 2025
Via arXiv

Interspeech 2025 URGENT Speech Enhancement Challenge

May 29, 2025
Via arXiv

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech

Feb 13, 2025
Via arXiv

Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech Prediction

Feb 11, 2025
Via arXiv

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music

Dec 23, 2024
Via arXiv

Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling

Dec 19, 2024
Via arXiv

Text-To-Speech Synthesis In The Wild

Sep 13, 2024
Via arXiv

Towards Robust Speech Representation Learning for Thousands of Languages

Jul 02, 2024
Via arXiv